The dynamical strength of social ties in information spreading 
We investigate the temporal patterns of human communication and its influence on the spreading 
of information in social networks. The analysis of mobile phone calls of 20 million people in one 
country shows that human communication is bursty and happens in group conversations. These 
features have opposite effects in information reach: while bursts hinder propagation at large scales, 
conversations favor local rapid cascades. To explain these phenomena we define the dynamical 
strength of social ties, a quantity that encompasses both the topological and temporal patterns of 
human communication. 



Quantitative understanding of human communication
patterns is of paramount importance to explain the dynamics
of many social, technological and economic phenomena [1].
Most studies have focused on the complex topological
patterns of the underlying contact network (whom we talk to)
and their influence on the properties of spreading phenomena
in social networks, such as the diffusion of information,
innovations, computer viruses,
opinions, etc. [2]. Paradoxically, most of these studies
of dynamical phenomena on social networks neglect the 
temporal patterns of human communications: humans act 
in bursts or cascades of events [5]-[8] , most ties are not per- 
sistent 110] and communications happen mostly in the 
form of group conversations [81 11H - FT5] . However, since 
information transmission and human communication are 
concurrent, the temporal structure of communication 
must influence the properties of information spreading. 
Indeed, recent experiments of electronic recommendation 
forwarding [14] and simulations of epidemic models on 
email and mobile databases [HUH] found that the asymp- 
totic speed of information spreading is controlled by the 
bursty nature of human communications that leads to 
a slowing down of the diffusion. However, although the
asymptotic speed is an important property of the propagation
of information in social networks, there is still no
general understanding of which temporal properties
of human communication influence spreading
processes, how they do so and, in turn, how they affect
the very definition of social interaction.

The answer to this question can be framed in the more
general problem of how to model dynamical social networks
[9, 16]. In most studies, real temporal activity is
aggregated over time, giving a static snapshot of the social
interaction where ties are described by static strengths
which do not include information about the temporal aspects
of how humans interact. Temporal and topological
aspects are therefore disentangled in the analysis. In this
letter we merge both aspects in the case of information
diffusion by adopting a functional definition of the social
ties using the well-known map between dynamical epidemic
models and static percolation [17]. The network
is still described by a static graph, but the interaction
strength between individuals now incorporates the causal
and temporal patterns of their communications and not
only their intensity [18].

To this end we study the mobile communication patterns
from a European operator in a single country over
a period of 11 months. The data consist of $2 \times 10^7$
phone numbers and $7 \times 10^8$ communication ties, for a
total of 9 billion calls between users. Each Call Detail Record
(CDR) contains the hashed numbers of the caller and the
receiver, the time when the call was initiated and the duration
of the call. We consider only events in which the
caller and the callee both belong to the operator under
consideration, because we have only partial access to the records
of other operators. Our data for the connectivity of the
social network, the duration of the calls, etc. are very
similar to the ones reported in previous studies [18].

Firstly, we investigate the temporal patterns of communication
that might affect information diffusion. Spreading
from user $i$ to user $j$ ($i \to j$) happens at the relay
time intervals $\tau_{ij}$ (also called inter-contact times [5]), i.e.
the time it takes $i$ to pass on to $j$ a piece of information
that he/she got from any other person $* \to i$ (where
$j \neq *$, see Fig. 1). Information spreading is thus determined
by the interplay between $\tau_{ij}$ and the intrinsic
timescale of the infection process. Note that $\tau_{ij}$ depends
on the correlated and causal way in which group conversations
happen, since it depends on the inter-event
intervals $\delta t_{ij}$ in the $i \to j$ communication but also on the
possible temporal correlation with the $* \to i$ events [17].
Ignoring this correlation, it is possible to approximate
the probability distribution function (pdf) of $\tau_{ij}$ by the
waiting-time density for $\delta t_{ij}$ [6, 15]

$$P(\tau_{ij}) = \frac{1}{\overline{\delta t}_{ij}} \int_{\tau_{ij}}^{\infty} P(\delta t_{ij})\, d\delta t_{ij} \qquad (1)$$

where $\overline{\delta t}_{ij}$ is the average inter-event time. In this
approximation, the dynamics of the transmission process depends only
on the dyadic $i \to j$ sequence of communication
events and, in particular, the possible heavy-tail properties
of $P(\delta t_{ij})$ are directly inherited by $P(\tau_{ij})$. Fig. 2
shows our (rescaled) results for $P(\delta t_{ij})$ and $P(\tau_{ij})$. For
comparison, we also show the results obtained when i)
the time-stamps of the $* \to i$ events are randomly selected
from the complete CDR, thus destroying any possible
temporal correlation with $i \to j$ and effectively mimicking
Eq. (1), and ii) the time-stamps of the whole CDR
are shuffled, thus destroying both the temporal patterns of each tie
and the correlation between ties. Both shufflings preserve the
tie intensity $w_{ij}$ [18], i.e. the number of calls and their
duration, and also the circadian rhythms of human
communication [15]. The result for $P(\delta t_{ij})$ shows that small
and large inter-event times are more probable for the real
series than for the shuffled ones, where the pdf is almost
exponential as in a Poissonian process, apart from a small
deviation due to the circadian rhythms. This bursty pattern
of activity has been found in numerous examples
of human behavior [6] and seems to be universal in the
way a single individual schedules tasks. Here we see that
it also happens at the level of the interaction between two
individuals, confirming recent results on the dynamics of mobile [15]
and online communities [7]. The pdf for $\tau_{ij}$ is also
heavy-tailed but displays a larger number of short $\tau_{ij}$ compared
to the shuffled one. The abundance of short $\tau_{ij}$ suggests
that receiving a piece of information ($* \to i$) triggers
communication with other people ($i \to j$), a manifestation of
group conversations [TTlll3|. While the fat tail of $P(\tau_{ij})$
is accurately described by Eq. (1), i.e. large transmission
intervals are mostly due to large inter-event communication
times in the $i \to j$ tie, the behavior of $P(\tau_{ij})$ at short times is
not only due to the bursty patterns of $\delta t_{ij}$, but also to the
temporal correlation between the $i \to j$ and the $* \to i$
events. In fact, if the correlation between the $i \to j$ and
the $* \to i$ series is destroyed, the probability of short
time intervals decreases and approaches the Poissonian
case (Fig. 2). In summary, relay times depend on two
main properties of human communication that compete
with one another. While the bursty nature of human activity
yields large transmission times, hindering any
possible infection, group conversations translate into an
unexpected abundance of short relay times, favoring the
probability of propagation.
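
As an illustration of the quantities discussed above, the following minimal sketch computes inter-event times and relay times for a single tie from two lists of time-stamps, together with an empirical version of the waiting-time approximation of Eq. (1). The function names, the input format (plain lists of time-stamps) and the convention that an $i \to j$ call must occur strictly after the incoming $* \to i$ call are assumptions made for this example, not details taken from the analysis in this letter.

```python
import numpy as np

def inter_event_times(out_times):
    """Inter-event times dt_ij between consecutive i -> j events."""
    out_times = np.sort(np.asarray(out_times, dtype=float))
    return np.diff(out_times)

def relay_times(in_times, out_times):
    """Relay times tau_ij: for each * -> i event at t_alpha, the time elapsed
    until the next i -> j event (assumed here to be strictly later).
    Incoming events with no later i -> j event are dropped."""
    in_times = np.asarray(in_times, dtype=float)
    out_times = np.sort(np.asarray(out_times, dtype=float))
    # Index of the first i -> j event strictly after each t_alpha.
    idx = np.searchsorted(out_times, in_times, side="right")
    valid = idx < len(out_times)
    return out_times[idx[valid]] - in_times[valid]

def waiting_time_density(dt, tau):
    """Empirical version of the waiting-time approximation:
    P(tau) ~ (fraction of inter-event times larger than tau) / mean(dt)."""
    dt = np.asarray(dt, dtype=float)
    return float(np.mean(dt > tau) / np.mean(dt))

# Toy time-stamps (arbitrary units), purely illustrative.
incoming = [0.0, 5.0, 12.0]   # * -> i events
outgoing = [1.0, 2.0, 10.0]   # i -> j events
tau = relay_times(incoming, outgoing)
dt = inter_event_times(outgoing)
```

Comparing the histogram of `tau` with `waiting_time_density` evaluated on `dt` would show, for a given tie, how far the real relay times deviate from the uncorrelated approximation.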

FIG. 1. (color online) Schematic view of communication
events around individual $i$: each horizontal segment indicates
an event between $i \to j$ (top) and $* \to i$ (bottom). At each
$t_\alpha$ in the $* \to i$ time series, $\tau_{ij}$ is the time elapsed to the next
$i \to j$ event, which is different from the inter-event time $\delta t_{ij}$
in the $i \to j$ time series. The red shaded area represents the
recovery time window $T_i$ after $t_\alpha$.

FIG. 2. (color online) Distribution of the relay time intervals
$\tau_{ij}$ (main) and of the inter-event times $\delta t_{ij}$ (inset) in the
$i \to j$ tie, rescaled by $\overline{\delta t}_{ij}$. The black circles correspond to
the real data, while the red squares are the overall-shuffled
result. Blue diamonds correspond to the case in which only the
$* \to i$ sequence is randomized. Only ties with $w_{ij} > 10$ are
considered. In both graphs the dashed line corresponds to the
$e^{-x}$ function.

To investigate the effect of these two conflicting properties
of human communication on information spreading,
we simulate the epidemic Susceptible-Infectious-Recovered
(SIR) model in our social network considering
the real time sequence of communication events [HI [23]
and compare the results to those obtained with the shuffled data. We start the
model by infecting a node at a random instant and considering
all other nodes as susceptible. In each call an
infected node can infect a susceptible node with probability
$\lambda$. Due to the synchronous nature of phone
communication, this happens regardless of who initiates
the call. However, since the same results are obtained
by considering directionality in the calls, for computational
reasons we consider the latter case. Nodes remain
infected during a time $T_i$ until they decay into the recovered
state. For the sake of simplicity we simulate the
simplest model, in which the recovery time $T_i$ is deterministic
and homogeneous, $T_i = T$, and set $T = 2$ days,
although different and/or stochastic $T_i$ can be studied
within the same model. The spreading dynamics generates
a viral cascade that grows until there are no more
nodes in the infected state. We repeat the spreading process
for $3 \times 10^4$ randomly chosen seeds. Note that our
model includes the SI model simulations in [15], where
$\lambda = 1$ and $T = T_0$, with $T_0$ being the total duration of
the dataset.
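
The simulation just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the results in this letter: the event format `(time, caller, callee)`, the function name `simulate_sir` and the `directed` flag are hypothetical choices made for the example. Each call made by an infected node within its recovery window $T$ infects a susceptible callee with probability $\lambda$ (`lam`).

```python
import random

def simulate_sir(events, seed, t0, lam, T, directed=True, rng=None):
    """Run one SIR cascade on a time-ordered list of calls.

    events   : list of (time, caller, callee), sorted by time (assumed format)
    seed     : node infected at time t0
    lam      : infection probability per call
    T        : deterministic recovery time (same for all nodes)
    directed : if True only caller -> callee transmission is allowed
    Returns the set of nodes that were ever infected.
    """
    rng = rng or random.Random()
    infected_at = {seed: t0}  # node -> time at which it became infected
    for t, caller, callee in events:
        if t < t0:
            continue  # calls before the seeding instant cannot transmit
        pairs = [(caller, callee)]
        if not directed:
            pairs.append((callee, caller))
        for src, dst in pairs:
            t_inf = infected_at.get(src)
            if t_inf is None or dst in infected_at:
                continue  # source never infected, or target not susceptible
            # The source is infectious only within [t_inf, t_inf + T].
            if t <= t_inf + T and rng.random() < lam:
                infected_at[dst] = t
    return set(infected_at)

# Toy call sequence (times in days), purely illustrative.
calls = [(1.0, "a", "b"), (2.0, "b", "c"), (10.0, "c", "d")]
cascade = simulate_sir(calls, seed="a", t0=0.0, lam=1.0, T=2.0)
```

Averaging the cascade size `len(cascade)` over many random seeds and seeding times, for the real and for the shuffled event lists, gives the kind of comparison discussed in the text; setting `lam = 1` and `T` equal to the total duration of the data reduces the sketch to an SI process.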

By looking at the size of the largest cascade $s_{\max}$ (over
all realizations) at each value of $\lambda$, we first ensure the
existence of a percolation transition [3] (see Fig. 3), confirmed
by a change in the behavior of $s_{\max}$ from small
to large cascades at a given value of $\lambda$ (tipping point).

FIG. 3. (color online) Average size dynamics for a large (a)
and a small (b) value of $\lambda$ (left) and maximum size (right)
of the infection outbreaks (over $10^4$ realizations) for the real
data (black lines) and shuffled data (red lines) for $T = 2$
days. The dashed line shows the critical point estimation
of the percolation transition given by $R_1[\lambda, T] = 1$, with $R_1$
calculated using Eq. (6).

FIG. 4. Ratio of the number of events (a) and probability
of no events (b) as a function of the recovery time $T$ for the
real (black circles) and shuffled $* \to i$ (red squares) data with
respect to the overall-shuffled data. Right panel (c) shows the
ratio of the average size of the outbreaks (black circles) and
of $R_1$ calculated using Eq. (6) (dashed blue line).



The same behavior is observed for the shuffled-time data,
where the transition seems to happen at almost the same
value of $\lambda$, although an accurate analysis of the percolation
point is beyond the scope of this letter. On the
contrary, there is a significant difference in the behavior
of the asymptotic average size $s_\infty$ between the real and
the shuffled-time data for different regimes of $\lambda$: when $\lambda$
is small, $s_\infty$ is bigger for the real data than for the shuffled
one, while the opposite behavior is observed for large
$\lambda$. This difference, which can be very large for moderate
values of $\lambda$, shows the impact of the real time dynamics
of communication on the reach of information in society.
Specifically, if information propagates easily (large
$\lambda$), the average reach in social networks is narrower than
the one expected when a Poissonian dynamics is considered.
In this sense, temporal patterns make social networks
bigger than expected at large scales. However, in
most real situations $\lambda$ is very small [H] and in this case
the observed behavior is the opposite: despite the low
propagation probability, information cascades are larger in real data
than in the Poisson case, which suggests that information
spreading is more efficient at small (local) scales.

To understand this behavior, we follow the approach
of [17], mapping the dynamical SIR model to a static
edge percolation model where each tie is described by
the transmissibility $\mathcal{T}_{ij}$, which represents the probability
that the information is transmitted from $i$ to $j$ and is a
function of $\lambda$ and $T$. If user $i$ becomes infected at time
$t_\alpha$ and the number of communication events $i \to j$ in the
interval $[t_\alpha, t_\alpha + T]$ is $n_{ij}(t_\alpha)$, then the transmissibility
in that interval is (see Fig. 1) $\mathcal{T}_{ij} = 1 - (1 - \lambda)^{n_{ij}(t_\alpha)}$.
User $i$ may become infected at any $* \to i$ communication
event. Assuming these events are independent and equally
probable, we can average $\mathcal{T}_{ij}$ over all the $t_\alpha$ events to get

$$\mathcal{T}_{ij}[\lambda, T] = \left\langle 1 - (1 - \lambda)^{n_{ij}(t_\alpha)} \right\rangle_\alpha . \qquad (2)$$

If the number of $* \to i$ events is large enough, we can
use a probabilistic description of Eq. (2) in terms of the
probability $P(n_{ij} = n; T)$ that the number of communication
events between $i$ and $j$ in a given time interval $T$
is $n$. Thus

$$\mathcal{T}_{ij}[\lambda, T] = \sum_{n=0}^{\infty} P(n_{ij} = n; T) \left[ 1 - (1 - \lambda)^n \right], \qquad (3)$$

which in principle can be non-symmetric ($\mathcal{T}_{ij} \neq \mathcal{T}_{ji}$).
This quantity represents the real probability of infection
from $i$ to $j$ and defines the dynamical strength of the tie.
Note that $\mathcal{T}_{ij}$ depends on the series of communication
events between $i$ and $j$, but also on the time series of
calls received by $i$. In [17] Newman studied the case in
which both time series are given by independent Poisson
processes in the whole observation interval $[0, T_0]$. Then
$P(n_{ij} = n; T)$ is the Poisson distribution with rate $\rho_{ij} = w_{ij} T / T_0$,
where $w_{ij}$ is the total number of calls from $i$ to $j$
in $[0, T_0]$, so that

$$\mathcal{T}_{ij}[\lambda, T] = 1 - e^{-\lambda \rho_{ij}} = 1 - e^{-\lambda w_{ij} T / T_0}, \qquad (4)$$

which shows the one-to-one relationship between the intensity
$w_{ij}$ and the transmissibility $\mathcal{T}_{ij}$ in the Poissonian
case: the more intense the communication is, the larger
the probability of infection. However, as we have seen in
Fig. 2, the real $i \to j$ and $* \to i$ series are far from being
independent and Poissonian. In order to investigate
the effect of real patterns of communication on the
transmissibility, we approximate Eq. (3). For small values of $\lambda$
we have $1 - (1 - \lambda)^n \simeq \lambda n$, while when $\lambda \simeq 1$ we get
$1 - (1 - \lambda)^n \simeq 1$ for $n > 0$. Thus, the transmissibility for
the two regimes is given by:

$$\mathcal{T}_{ij} \simeq \begin{cases} \lambda \langle n_{ij} \rangle_\alpha & \text{when } \lambda \ll 1 \\ 1 - P^0_{ij} & \text{when } \lambda \simeq 1 \end{cases} \qquad (5)$$



where $P^0_{ij} = P(n_{ij} = 0; T)$. Specifically, $P^0_{ij}$ can be
estimated directly from Eq. (1) for each link, $P^0_{ij} = \int_T^{\infty} P(\tau_{ij})\, d\tau_{ij}$,
since it measures the probability of finding a
relay time bigger than $T$. Fig. 4 shows the comparison of
$\langle n_{ij} \rangle_\alpha$ and $P^0_{ij}$ (averaged over all links) for different values
of $T$ for the real and shuffled data (denoted by a tilde). On
the one hand, due to the correlation between the $* \to i$ and
$i \to j$ time series, the number of events in a tie following
an incoming call is always larger for the real data than
for the shuffled one. This is the reason why, for small $\lambda$,
the average transmissibility (and thus the size of the epidemic
cascades) is always higher in real communication
patterns [H]. On the other hand, the bursty nature of the
$i \to j$ communication makes the tail of the real $P(\tau_{ij})$
heavier than the exponential distribution found in the
shuffled data. Thus if $T$ is large enough, $P^0_{ij}$ is larger in
the real than in the shuffled data, and this is why we observe
smaller cascades in the large-$\lambda$ regime. Note however that
this does not apply for very small values of $T$ ($T < 1$
day), where the causality between the $* \to i$ and the $i \to j$
time series can make $P^0_{ij}$ even smaller in the real case.
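
To make the two regimes concrete, the sketch below estimates the transmissibility of a single tie as the average over incoming events of $1 - (1 - \lambda)^{n_{ij}(t_\alpha)}$, and compares it with the Poissonian expression $1 - e^{-\lambda w_{ij} T / T_0}$. The function names, the input format and the choice of counting $i \to j$ events in the half-open window $(t_\alpha, t_\alpha + T]$ are assumptions made for this example only.

```python
import math
import numpy as np

def empirical_transmissibility(in_times, out_times, lam, T):
    """Average over the * -> i events (at times t_alpha) of
    1 - (1 - lam)**n, where n is the number of i -> j events
    in the window (t_alpha, t_alpha + T]."""
    in_times = np.asarray(in_times, dtype=float)
    out_times = np.sort(np.asarray(out_times, dtype=float))
    lo = np.searchsorted(out_times, in_times, side="right")
    hi = np.searchsorted(out_times, in_times + T, side="right")
    n = hi - lo  # n_ij(t_alpha) for every incoming event
    return float(np.mean(1.0 - (1.0 - lam) ** n))

def poisson_transmissibility(w_ij, lam, T, T0):
    """Poissonian case: 1 - exp(-lam * w_ij * T / T0)."""
    return 1.0 - math.exp(-lam * w_ij * T / T0)

# Toy tie (arbitrary time units), purely illustrative.
incoming = [0.0, 10.0]        # * -> i events
outgoing = [1.0, 2.0, 20.0]   # i -> j events
t_small = empirical_transmissibility(incoming, outgoing, lam=0.01, T=3.0)
t_large = empirical_transmissibility(incoming, outgoing, lam=1.0, T=3.0)
```

In this toy tie the first incoming call is followed by two outgoing calls within the window and the second by none, so for `lam = 1.0` the estimate equals the fraction of windows with at least one event (one minus the probability of no events), while for small `lam` it is close to `lam` times the mean number of events per window.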

To give a more quantitative analysis of the observed
behavior, we investigate the percolation process in a social
network in which links have transmissibility $\mathcal{T}_{ij}$. The
important quantity is the secondary reproductive number
$R_1$, that is, the average number of secondary infections
produced by an infectious individual. $R_1$ gives information
about the percolation transition in the SIR process
(which happens at $R_1 = 1$ [UJ), but also about the speed
of diffusion (which is proportional to $R_1$ [20]) and the
size of the cascades (which is a growing function of $R_1$
[IT]). Assuming that the $\mathcal{T}_{ij}$ are given and that the social
network is random in any other respect, $R_1$ can be
approximated as

$$R_1[\lambda, T] = \frac{\left\langle \left( \sum_j \mathcal{T}_{ij} \right)^2 \right\rangle_i - \left\langle \sum_j \mathcal{T}_{ij}^2 \right\rangle_i}{\left\langle \sum_j \mathcal{T}_{ij} \right\rangle_i} . \qquad (6)$$



Note that in the homogeneous case in which $\mathcal{T}_{ij} = \mathcal{T}$
we recover the common result in random networks,
$R_1 = \mathcal{T} \left( \langle k_i^2 \rangle / \langle k_i \rangle - 1 \right)$ [17]. Figs. 3 and 4 show the accuracy
of the approximations used to get Eq. (6) in predicting the
tipping point in the SIR process and the change in the
average size of the cascades in the two regimes. This suggests
that the dynamical strength of the ties $\mathcal{T}_{ij}$, defined
in Eq. (3), can be effectively used to model the real strength
of human interactions in social networks.
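
As a small numerical check of the homogeneous limit, the sketch below evaluates the secondary reproductive number from a set of tie transmissibilities. It assumes that $R_1$ takes the form $\left[ \langle (\sum_j \mathcal{T}_{ij})^2 \rangle_i - \langle \sum_j \mathcal{T}_{ij}^2 \rangle_i \right] / \langle \sum_j \mathcal{T}_{ij} \rangle_i$, which reduces to $\mathcal{T}(\langle k_i^2 \rangle / \langle k_i \rangle - 1)$ when all transmissibilities are equal; the dictionary input format and the function name are hypothetical choices for this example.

```python
import numpy as np

def secondary_reproductive_number(trans):
    """R_1 from tie transmissibilities.

    trans : dict mapping each node i to the list of transmissibilities
            T_ij of its ties (assumed input format).
    Assumed form: (<(sum_j T_ij)^2>_i - <sum_j T_ij^2>_i) / <sum_j T_ij>_i.
    """
    s1 = np.array([np.sum(v) for v in trans.values()], dtype=float)
    s2 = np.array([np.sum(np.square(v)) for v in trans.values()], dtype=float)
    if np.mean(s1) == 0.0:
        raise ValueError("all transmissibilities are zero; R_1 is undefined")
    return float((np.mean(s1 ** 2) - np.mean(s2)) / np.mean(s1))

# Homogeneous toy network: degrees 1, 2, 3 and the same T = 0.5 on every tie.
toy = {"a": [0.5], "b": [0.5, 0.5], "c": [0.5, 0.5, 0.5]}
r1 = secondary_reproductive_number(toy)
# Homogeneous formula T * (<k^2>/<k> - 1) with <k> = 2 and <k^2> = 14/3.
r1_homogeneous = 0.5 * ((14.0 / 3.0) / 2.0 - 1.0)
```

The two values agree for the toy network, and the percolation threshold would be located by scanning $\lambda$ until the returned value crosses 1.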

In conclusion, we have seen that the bursty nature
of human communications and the existence of group
conversations are the two main dynamical ingredients needed to
understand the spreading of information in social networks.
These two effects compete in the spreading, the former
hindering and the latter favoring information reach when compared
with the homogeneous case. Our results indicate the
necessity of incorporating temporal patterns of communication
in the description and modeling of human interaction.
In particular, we have shown an effective way to map
the dynamics of human interactions onto a static representation
of the social network through the concept of
dynamical strength of ties. We believe its success in explaining
information diffusion will encourage the use of
this dynamical strength in other areas of network research
that are based on information spreading, like the determination
of influence/centrality [21], community finding
[22], viral marketing [H[23], etc.